Content

  1. Brief Introduction
  2. Details About Data
    • Data Sources
    • Data Fields
  3. Main Technologies Used
  4. Data Transformation
    • Main Libraries
    • Data Wrangling
  5. Shiny Application
    • Structure of ui
    • Structure of server
    • Tables
    • Graphs
  6. Deployment
  7. References

1. Brief Introduction

This dashboard is built with R Shiny, and this documentation is published on the website with R Markdown. It describes the technical details of how the web application was built, including how the data was cleaned and organized before being loaded into the dashboard's visualizations.

How to see information on the website:

2. Details About Data

Details about the dataset can be found here: data.gov.sg

3. Main Technologies Used

4. Data Transformation

Please refer to the DataWrangling.R file for more details.

Main Libraries

First, we import the dataset. We use the DT package to display an interactive table that fits the page width.

Data Fields

library(DT)  # provides datatable()
data <- read.csv('data/employment_data.csv')
datatable(data, rownames = FALSE, filter = "top", class = "table",
          options = list(pageLength = 5, scrollX = TRUE))

Check the dimensions of the data frame.

dim(data)
## [1] 703  12

As a first step, we convert the following variables from text (character or factor, depending on the R version) to numeric. Entries that are not numbers become NA:

  • employment_rate_overall
  • employment_rate_ft_perm
  • basic_monthly_mean
  • basic_monthly_median
  • gross_monthly_mean
  • gross_monthly_median
  • gross_mthly_25_percentile
  • gross_mthly_75_percentile

Data Wrangling

data[,5:12] <- apply(data[,5:12], 2, function(x) as.numeric(as.character(x)))
str(data)
## 'data.frame':    703 obs. of  12 variables:
##  $ year                     : int  2013 2013 2013 2013 2013 2013 2013 2013 2013 2013 ...
##  $ university               : chr  "Nanyang Technological University" "Nanyang Technological University" "Nanyang Technological University" "Nanyang Technological University" ...
##  $ school                   : chr  "College of Business (Nanyang Business School)" "College of Business (Nanyang Business School)" "College of Business (Nanyang Business School)" "College of Business (Nanyang Business School)" ...
##  $ degree                   : chr  "Accountancy and Business" "Accountancy (3-yr direct Honours Programme)" "Business (3-yr direct Honours Programme)" "Business and Computing" ...
##  $ employment_rate_overall  : num  97.4 97.1 90.9 87.5 95.3 81.3 87.3 90.3 94.8 92.1 ...
##  $ employment_rate_ft_perm  : num  96.1 95.7 85.7 87.5 95.3 68.8 85.1 88.2 93.8 88.5 ...
##  $ basic_monthly_mean       : num  3701 2850 3053 3557 3494 ...
##  $ basic_monthly_median     : num  3200 2700 3000 3400 3500 2900 3000 3100 3000 3000 ...
##  $ gross_monthly_mean       : num  3727 2938 3214 3615 3536 ...
##  $ gross_monthly_median     : num  3350 2700 3000 3400 3500 ...
##  $ gross_mthly_25_percentile: num  2900 2700 2700 3000 3100 ...
##  $ gross_mthly_75_percentile: num  4000 2900 3500 4100 3816 ...
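
The `as.numeric(as.character(x))` idiom used above matters when a column is stored as a factor: `as.numeric()` alone would return the internal level codes rather than the printed numbers. A minimal sketch on a made-up vector follows; the values and the "na" placeholder are assumptions for illustration, not taken from the dataset.

```r
# Toy column with hypothetical values; "na" stands in for a non-numeric entry.
raw <- factor(c("97.4", "na", "81.3"))

# Misleading for factors: returns the integer level codes, not the values.
codes <- as.numeric(raw)

# Intended result: convert to character first; non-numeric entries become NA.
# R warns "NAs introduced by coercion", which is suppressed here for brevity.
values <- suppressWarnings(as.numeric(as.character(raw)))

print(codes)
print(values)
```

For columns that are already character, the inner `as.character()` is a harmless no-op, so the idiom is safe in both cases.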

Check how many missing values each column contains.

colSums(sapply(data, is.na)) %>% 
  kable()
|                          |  x|
|:-------------------------|--:|
|year                      |  0|
|university                |  0|
|school                    |  0|
|degree                    |  0|
|employment_rate_overall   | 73|
|employment_rate_ft_perm   | 73|
|basic_monthly_mean        | 73|
|basic_monthly_median      | 73|
|gross_monthly_mean        | 73|
|gross_monthly_median      | 73|
|gross_mthly_25_percentile | 73|
|gross_mthly_75_percentile | 73|
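
The same per-column count can be written without `sapply()`, because `is.na()` applied to a data frame already returns a logical matrix. A small sketch on a made-up data frame (`toy` and its values are hypothetical):

```r
# Hypothetical data with a few missing entries.
toy <- data.frame(
  degree = c("A", "B", "C"),
  employment_rate_overall = c(97.4, NA, 81.3),
  basic_monthly_median = c(3200, NA, NA)
)

# Number of missing values per column.
na_counts <- colSums(is.na(toy))
print(na_counts)

# Share of missing values per column, useful when deciding whether to drop rows.
na_share <- colMeans(is.na(toy))
print(round(na_share, 2))
```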

Remove the rows that contain missing values and check the dimensions again.

data <- drop_na(data)
dim(data)
## [1] 630  12
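
`drop_na()` comes from the tidyr package. For readers without tidyr, the same row filter can be sketched in base R; here `complete.cases()` is swapped in for `drop_na()`, and the `toy` data frame is hypothetical.

```r
# Hypothetical data: rows B and C each contain one missing value.
toy <- data.frame(
  degree = c("A", "B", "C", "D"),
  employment_rate_overall = c(97.4, NA, 81.3, 90.0),
  basic_monthly_median = c(3200, 2700, NA, 3000)
)

# Keep only rows with no missing value in any column.
clean <- toy[complete.cases(toy), ]
rownames(clean) <- NULL  # renumber rows after filtering

print(dim(clean))
```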

Let’s take a look at how many records we have for each university.

table(data$university) %>% 
  kable(.,col.names = c('university', 'count'))
|university                                    | count|
|:---------------------------------------------|-----:|
|Nanyang Technological University              |   204|
|National University of Singapore              |   207|
|Singapore Institute of Technology             |   135|
|Singapore Management University               |    72|
|Singapore University of Social Sciences       |     3|
|Singapore University of Technology and Design |     9|

Let’s compare the basic monthly median income across universities. The boxplot shows that graduates of Singapore University of Technology and Design (SUTD) tend to have the highest salaries, although with only 9 records this result is likely to be biased. Singapore Management University comes second. Graduates of National University of Singapore and Nanyang Technological University, usually regarded as the top universities, rank 3rd and 4th, respectively, among the 6 universities.

p <- ggplot(data, aes(x = university, y = basic_monthly_median)) +
      geom_boxplot(fill = "steelblue", alpha = 0.5) +
      xlab("University") + ylab("Basic Monthly Median of Graduates")

p + coord_flip()
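
A ranking read off a boxplot is easier to verify with the numbers beside it. The sketch below, on made-up data (`toy`, its group labels, and its values are hypothetical), computes the per-group median and the record count with base R, so that a very small group is visible next to its median.

```r
# Hypothetical data: three groups of different sizes.
toy <- data.frame(
  university = c("U1", "U1", "U1", "U2", "U2", "U3"),
  basic_monthly_median = c(3000, 3200, 3400, 3500, 3700, 4000)
)

# Median and number of records per group.
med <- tapply(toy$basic_monthly_median, toy$university, median)
n <- tapply(toy$basic_monthly_median, toy$university, length)

summary_tbl <- data.frame(
  university = names(med),
  median = as.numeric(med),
  n = as.integer(n)
)

# Sort from highest to lowest median.
summary_tbl <- summary_tbl[order(-summary_tbl$median), ]
print(summary_tbl)
```

Reporting `n` alongside the median makes it clear when a top-ranked group rests on only a handful of records.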

Let’s look more closely at Nanyang Technological University and list the basic monthly median income for each school. The plot shows that graduates from the College of Arts barely earn over 3k per month. This is consistent with our experience that graduates from arts or social sciences schools are usually paid less than their peers with an engineering or business background.

monthlyNTU <- data %>% 
                filter(university=="Nanyang Technological University") %>% 
                group_by(school) %>% 
                summarise_at(.vars = names(.)[7:8],.funs = c(mean="mean"))

p <- ggplot(data.frame(monthlyNTU), aes(x=reorder(school,basic_monthly_median_mean), 
                                        y=basic_monthly_median_mean)) + 
          geom_bar(stat="identity", fill="steelblue", alpha=0.5) +
          xlab("Schools") + ylab("Basic Monthly Median of Graduates from NTU")
p + coord_flip() 
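
`summarise_at()` is marked as superseded in current dplyr releases in favour of `across()`. A sketch of an equivalent per-school aggregation follows, assuming dplyr 1.0 or later and using made-up data (`toy` is hypothetical). Selecting columns by name rather than by position also keeps the code working if the column order changes.

```r
library(dplyr)

# Hypothetical data: two schools with mean and median salary columns.
toy <- data.frame(
  school = c("S1", "S1", "S2"),
  basic_monthly_mean = c(3000, 3400, 3600),
  basic_monthly_median = c(2900, 3100, 3500)
)

# Average both columns within each school; ".names" appends "_mean"
# so the result columns match the naming produced by summarise_at().
by_school <- toy %>%
  group_by(school) %>%
  summarise(
    across(c(basic_monthly_mean, basic_monthly_median), mean,
           .names = "{.col}_mean"),
    .groups = "drop"
  )

print(by_school)
```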

The following graph shows the density of the full-time permanent employment rate (employment_rate_ft_perm) for each university.

ggplot(data, aes(x=employment_rate_ft_perm)) + geom_density(aes(colour = university)) + 
  xlab("Employment rate for Permanent Positions")

5. Shiny Application

Structure of ui

Please refer to the ui.R file for more details. Description of layout.

Structure of server

Please refer to the server.R file for more details.
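
Since the actual layout and logic live in ui.R and server.R, the following is only a generic skeleton of how a Shiny app pairs the two parts, not the code of this dashboard. The input id `uni`, the output ids `tbl` and `plt`, and the toy data are all hypothetical.

```r
library(shiny)

# Hypothetical data standing in for the cleaned employment dataset.
toy <- data.frame(
  university = c("U1", "U1", "U2"),
  basic_monthly_median = c(3000, 3200, 3600)
)

# ui: declares the inputs and the placeholders for outputs.
ui <- fluidPage(
  selectInput("uni", "University", choices = unique(toy$university)),
  tableOutput("tbl"),
  plotOutput("plt")
)

# server: fills the placeholders and re-runs when input$uni changes.
server <- function(input, output, session) {
  selected <- reactive(toy[toy$university == input$uni, ])
  output$tbl <- renderTable(selected())
  output$plt <- renderPlot(
    hist(selected()$basic_monthly_median,
         main = input$uni, xlab = "Basic monthly median")
  )
}

# Build the app object; launch it interactively with runApp(app).
app <- shinyApp(ui, server)
```

Keeping the filtered data in a single `reactive()` means the table and the plot always show the same subset.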

Tables

Graphs

6. Deployment

Here we describe the steps we took to launch the application on a web server.

7. References